NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

A randomized controlled trial on anonymizing reviewers to each other in peer review discussions

https://doi.org/10.1371/journal.pone.0315674

Rastogi, Charvi; Song, Xiangchen; Jin, Zhijing; Stelmakh, Ivan; Daumé, Hal; Zhang, Kun; Shah, Nihar B (December 2024, PLOS ONE)
Bailey, Henry Hugh (Ed.)
Many peer-review processes involve reviewers submitting their independent reviews, followed by a discussion between the reviewers of each paper. A common question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this question by conducting a randomized controlled trial at the Conference on Uncertainty in Artificial Intelligence (UAI) 2022 conference where reviewer discussions were conducted over a typed forum. We randomly split the reviewers and papers into two conditions–one with anonymous discussions and the other with non-anonymous discussions. We also conduct an anonymous survey of all reviewers to understand their experience and opinions. We compare the two conditions in terms of the amount of discussion, influence of seniority on the final decisions, politeness, reviewers’ self-reported experiences and preferences. Overall, this experiment finds small, significant differences favoring the anonymous discussion setup based on the evaluation criteria considered in this work.
more » « less
Full Text Available
Debiasing Evaluations That Are Biased by Evaluations

Wang, Jingyan; Stelmakh, Ivan; Wei, Yuting; Shah, Nihar (February 2024, Journal of machine learning research)

It is common to evaluate a set of items by soliciting people to rate them. For example, universities ask students to rate the teaching quality of their instructors, and conference organizers ask authors of submissions to evaluate the quality of the reviews. However, in these applications, students often give a higher rating to a course if they receive higher grades in a course, and authors often give a higher rating to the reviews if their papers are accepted to the conference. In this work, we call these external factors the" outcome" experienced by people, and consider the problem of mitigating these outcome-induced biases in the given ratings when some information about the outcome is available. We formulate the information about the outcome as a known partial ordering on the bias. We propose a debiasing method by solving a regularized optimization problem under this ordering constraint, and also provide a carefully designed cross-validation method that adaptively chooses the appropriate amount of regularization. We provide theoretical guarantees on the performance of our algorithm, as well as experimental evaluations.
more » « less
Full Text Available
No Rose for MLE: Inadmissibility of MLE for Evaluation Aggregation Under Levels of Expertise

https://doi.org/10.1109/ISIT50566.2022.9834340

Rastogi, Charvi; Stelmakh, Ivan; Shah, Nihar; Balakrishnan, Sivaraman (June 2022, international symposium on information theory)

Full Text Available
No Rose for MLE: Inadmissibility of MLE for Evaluation Aggregation Under Levels of Expertise

Rastogi, Charvi; Stelmakh, Ivan; Shah, Nihar; Balakrishnan, Sivaraman (January 2022, International symposium on information theory)

Full Text Available
Debiasing Evaluations That Are Biased by Evaluations

Wang, Jingyan; Stelmakh, Ivan; Wei, Yuting; Shah, Nihar (May 2021, Proceedings of the AAAI Conference on Artificial Intelligence)

Full Text Available
Debiasing Evaluations that are Biased by Evaluations

Wang, Jingyan; Stelmakh, Ivan; Wei, Yuting; Shah, Nihar (January 2021, AAAI)
null (Ed.)
Full Text Available
PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review

Stelmakh, Ivan and (April 2019, Algorithmic Learning Theory)

Full Text Available
PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review

Stelmakh, Ivan; Shah, Nihar B; Singh, Aarti (March 2019, Algorithmic Learning Theory)

We consider the problem of automated assignment of papers to reviewers in conference peer review, with a focus on fairness and statistical accuracy. Our fairness objective is to maximize the review quality of the most disadvantaged paper, in contrast to the popular objective of maximizing the total quality over all papers. We design an assignment algorithm based on an incremental max-flow procedure that we prove is near-optimally fair. Our statistical accuracy objective is to ensure correct recovery of the papers that should be accepted. With a sharp minimax analysis we also prove that our algorithm leads to assignments with strong statistical guarantees both in an objective-score model as well as a novel subjective-score model that we propose in this paper.
more » « less
Full Text Available
Prior and Prejudice: The Novice Reviewers' Bias against Resubmissions in Conference Peer Review

https://doi.org/10.1145/3449149

Stelmakh, Ivan; Shah, Nihar_B; Singh, Aarti; Daumé, III, Hal (April 2021, Proceedings of the ACM on Human-Computer Interaction)

Modern machine learning and computer science conferences are experiencing a surge in the number of submissions that challenges the quality of peer review as the number of competent reviewers is growing at a much slower rate. To curb this trend and reduce the burden on reviewers, several conferences have started encouraging or even requiring authors to declare the previous submission history of their papers. Such initiatives have been met with skepticism among authors, who raise the concern about a potential bias in reviewers' recommendations induced by this information. In this work, we investigate whether reviewers exhibit a bias caused by the knowledge that the submission under review was previously rejected at a similar venue, focusing on a population of novice reviewers who constitute a large fraction of the reviewer pool in leading machine learning and computer science conferences. We design and conduct a randomized controlled trial closely replicating the relevant components of the peer-review pipeline with $133$ reviewers (master's, junior PhD students, and recent graduates of top US universities) writing reviews for $19$ papers. The analysis reveals that reviewers indeed become negatively biased when they receive a signal about paper being a resubmission, giving almost 1 point lower overall score on a 10-point Likert item (Δ = -0.78, 95% CI = [-1.30, -0.24]) than reviewers who do not receive such a signal. Looking at specific criteria scores (originality, quality, clarity and significance), we observe that novice reviewers tend to underrate quality the most.
more » « less

Search for: All records